 Aosta Valley


Understanding the Geospatial Reasoning Capabilities of LLMs: A Trajectory Recovery Perspective

Truong, Thinh Hung, Lau, Jey Han, Qi, Jianzhong

arXiv.org Artificial Intelligence

We explore the geospatial reasoning capabilities of Large Language Models (LLMs), specifically, whether LLMs can read road network maps and perform navigation. Using the road network as context, our prompting framework enables LLMs to generate valid paths without accessing any external navigation tools. Experiments show that LLMs outperform off-the-shelf baselines and specialized trajectory recovery models, with strong zero-shot generalization. Fine-grained analysis shows that LLMs have strong comprehension of the road network and coordinate systems, but also exhibit systematic biases with respect to regions and transportation modes. Finally, we demonstrate how LLMs can enhance navigation experiences by reasoning over maps in flexible ways to incorporate user preferences. Large Language Models (LLMs) are increasingly recognized as general-purpose systems, showing strong performance across domains ranging from mathematics and coding to vision and robotics. An emerging yet underexplored question is whether these models possess geospatial understanding, the ability to reason about maps, paths, and spatial relationships. Such capabilities are fundamental to many real-world applications, e.g., autonomous vehicle navigation, logistics, and urban planning. While prior work has studied LLMs in contexts such as geographic knowledge retrieval (Manvi et al., 2024a;b) and map-based multiple-choice question answering (Dihan et al., 2025), the ability of LLMs to read road networks and plan paths has not been systematically evaluated. We investigate whether LLMs can perform navigation through the trajectory recovery task: reconstructing masked segments of GPS traces from the road network context. This framing avoids relying on shortest-path-style ground truth, which may not reflect human navigation patterns in practice (Golledge, 1995; Duckham & Kulik, 2003). Our dataset is framed in a way that is harder than the traditional point-wise trajectory recovery task (Newson & Krumm, 2009; Song et al., 2017; Si et al., 2024), and closer to the higher-level navigation problem.


FBFL: A Field-Based Coordination Approach for Data Heterogeneity in Federated Learning

Domini, Davide, Aguzzi, Gianluca, Esterle, Lukas, Viroli, Mirko

arXiv.org Artificial Intelligence

In recent years, federated learning (FL) has become a popular solution for training machine learning models in domains with high privacy concerns. However, FL scalability and performance face significant challenges in real-world deployments where data across devices are non-independently and identically distributed (non-IID). This heterogeneity in data distribution frequently arises from the spatial distribution of devices, and it degrades model performance when not handled properly. Additionally, FL's typical reliance on centralized architectures introduces bottlenecks and single-point-of-failure risks, which are particularly problematic at scale or in dynamic environments. To close this gap, we propose Field-Based Federated Learning (FBFL), a novel approach leveraging macroprogramming and field coordination to address these limitations through: (i) distributed spatial-based leader election for personalization to mitigate non-IID data challenges; and (ii) construction of a self-organizing, hierarchical architecture using advanced macroprogramming patterns. Moreover, FBFL not only overcomes the aforementioned limitations, but also enables the development of more specialized models tailored to the specific data distribution in each subregion. This paper formalizes FBFL and evaluates it extensively using the MNIST, FashionMNIST, and Extended MNIST datasets. We demonstrate that, when operating under IID data conditions, FBFL performs comparably to the widely used FedAvg algorithm. Furthermore, in challenging non-IID scenarios, FBFL not only outperforms FedAvg but also surpasses other state-of-the-art methods, namely FedProx and Scaffold, which have been specifically designed to address non-IID data distributions.
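
For readers unfamiliar with the FedAvg baseline named in this abstract, the following is a minimal sketch of its server-side aggregation step only: a weighted average of client parameter vectors, with weights proportional to each client's local sample count. It is not FBFL and does not come from the paper; the function name `fedavg` and the flat list-of-floats parameter layout are assumptions chosen for illustration.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Hypothetical helper for illustration: each client is represented by a
    flat list of floats rather than a real model state.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * n_samples / total
    return averaged


if __name__ == "__main__":
    # Two toy clients; the second holds three times as much data.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Because every client is pulled toward one global average, clients whose local data distributions differ sharply can be poorly served by the result, which is the non-IID difficulty the abstract refers to.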


Open Challenges in the Formal Verification of Autonomous Driving

Burgio, Paolo, Ferrando, Angelo, Villani, Marco

arXiv.org Artificial Intelligence

In the realm of autonomous driving, the development and integration of highly complex and heterogeneous systems are standard practice. Modern vehicles are not monolithic systems; instead, they are composed of diverse hardware components, each running its own software systems. An autonomous vehicle comprises numerous independent components, often developed by different and potentially competing companies. This diversity poses significant challenges for the certification process, as it necessitates certifying components that may not disclose their internal behaviour (black-boxes). In this paper, we present a real-world case study of an autonomous driving system, identify key open challenges associated with its development and integration, and explore how formal verification techniques can address these challenges to ensure system reliability and safety.


MP-PINN: A Multi-Phase Physics-Informed Neural Network for Epidemic Forecasting

Nguyen, Thang, Nguyen, Dung, Pham, Kha, Tran, Truyen

arXiv.org Artificial Intelligence

Forecasting temporal processes such as virus spreading in epidemics often requires more than just observed time-series data, especially at the beginning of a wave when data is limited. Traditional methods employ mechanistic models like the SIR family, which make strong assumptions about the underlying spreading process, often represented as a small set of compact differential equations. Data-driven methods such as deep neural networks make no such assumptions and can capture the generative process in more detail, but fail in long-term forecasting due to data limitations. We propose a new hybrid method called MP-PINN (Multi-Phase Physics-Informed Neural Network) to overcome the limitations of these two major approaches. MP-PINN instils the spreading mechanism into a neural network, enabling the mechanism to update in phases over time, reflecting the dynamics of the epidemics due to policy interventions. Experiments on COVID-19 waves demonstrate that MP-PINN achieves superior performance over pure data-driven or model-driven approaches for both short-term and long-term forecasting.
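
As background for the "SIR family" mentioned in this abstract, the standard SIR compartmental model can be written as below. The phase index $k$ and the per-phase parameters $\beta_k$, $\gamma_k$ are an assumption made here to illustrate one way a mechanism could "update in phases"; the abstract does not state the paper's exact formulation, which may differ.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta_k \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta_k \frac{S I}{N} - \gamma_k I, \\
\frac{dR}{dt} &= \gamma_k I,
\end{aligned}
\qquad t \in [t_{k-1}, t_k), \quad N = S + I + R .
```

Here $S$, $I$, and $R$ are the susceptible, infectious, and recovered counts, $\beta_k$ is the transmission rate, and $\gamma_k$ is the recovery rate in phase $k$. With a single phase this reduces to the classical SIR model with constant parameters.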


Large Language Models Reflect the Ideology of their Creators

Buyl, Maarten, Rogiers, Alexander, Noels, Sander, Dominguez-Catena, Iris, Heiter, Edith, Romero, Raphael, Johary, Iman, Mara, Alexandru-Cristian, Lijffijt, Jefrey, De Bie, Tijl

arXiv.org Artificial Intelligence

Large language models (LLMs) are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. These models have become popular in artificial intelligence (AI) assistants like ChatGPT and already play an influential role in how humans access information. However, the behavior of LLMs varies depending on their design, training, and use. In this paper, we uncover notable diversity in the ideological stance exhibited across different LLMs and languages in which they are accessed. We do this by prompting a diverse panel of popular LLMs to describe a large number of prominent and controversial personalities from recent world history, both in English and in Chinese. By identifying and analyzing moral assessments reflected in the generated descriptions, we find consistent normative differences between how the same LLM responds in Chinese compared to English. Similarly, we identify normative disagreements between Western and non-Western LLMs about prominent actors in geopolitical conflicts. Furthermore, popularly hypothesized disparities in political goals among Western models are reflected in significant normative differences related to inclusion, social inequality, and political scandals. Our results show that the ideological stance of an LLM often reflects the worldview of its creators. This raises important concerns around technological and regulatory efforts with the stated aim of making LLMs ideologically 'unbiased', and it poses risks for political instrumentalization.


A model learning framework for inferring the dynamics of transmission rate depending on exogenous variables for epidemic forecasts

Ziarelli, Giovanni, Pagani, Stefano, Parolini, Nicola, Regazzoni, Francesco, Verani, Marco

arXiv.org Artificial Intelligence

In this work, we aim to formalize a novel scientific machine learning framework to reconstruct the hidden dynamics of the transmission rate, whose inaccurate extrapolation can significantly impair the quality of epidemic forecasts, by incorporating the influence of exogenous variables (such as environmental conditions and strain-specific characteristics). We propose a hybrid model that blends a data-driven layer with a physics-based one. The data-driven layer is based on a neural ordinary differential equation that learns the dynamics of the transmission rate, conditioned on the meteorological data and wave-specific latent parameters. The physics-based layer, instead, consists of a standard SEIR compartmental model, in which the transmission rate is an input. The learning strategy follows an end-to-end approach: the loss function quantifies the mismatch between the actual number of infections and its numerical prediction, obtained from the SEIR model that takes as input the transmission rate predicted by the neural ordinary differential equation. We validate this original approach using both a synthetic test case and a realistic test case based on meteorological data (temperature and humidity) and influenza data from Italy between 2010 and 2020. In both scenarios, we achieve low generalization error on the test set and observe strong alignment between the reconstructed model and established findings on the influence of meteorological factors on epidemic spread. Finally, we implement a data assimilation strategy to adapt the neural equation to the specific characteristics of an epidemic wave under investigation, and we conduct sensitivity tests on the network hyperparameters.
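
The physics-based layer described in this abstract, a standard SEIR model whose transmission rate is supplied as an input, can be sketched as follows. This is a minimal forward-Euler illustration and not the authors' implementation: the names `simulate_seir` and `seasonal_beta`, all parameter values, and the cosine-shaped transmission rate are assumptions. In the paper the transmission rate would come from the neural ordinary differential equation; here a hand-written function stands in for it.

```python
import math
from typing import Callable, List, Tuple

State = Tuple[float, float, float, float]  # (S, E, I, R)


def simulate_seir(beta: Callable[[float], float], sigma: float, gamma: float,
                  initial: State, days: int, dt: float = 0.1) -> List[State]:
    """Integrate a standard SEIR model with a time-varying transmission rate.

    beta(t) is the transmission rate at time t (in days), sigma is the
    E -> I rate, and gamma is the I -> R rate. Returns one state per day,
    including the initial state.
    """
    s, e, i, r = initial
    n = s + e + i + r  # total population, conserved by the flows below
    steps_per_day = round(1 / dt)
    trajectory = [initial]
    t = 0.0
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta(t) * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt          # E -> I
            new_recovered = gamma * i * dt           # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
            t += dt
        trajectory.append((s, e, i, r))
    return trajectory


def seasonal_beta(t: float) -> float:
    """Hypothetical stand-in for a learned transmission rate (yearly cycle)."""
    return 0.3 * (1.0 + 0.2 * math.cos(2.0 * math.pi * t / 365.0))


if __name__ == "__main__":
    states = simulate_seir(seasonal_beta, sigma=0.2, gamma=0.1,
                           initial=(9990.0, 0.0, 10.0, 0.0), days=120)
    peak_day = max(range(len(states)), key=lambda d: states[d][2])
    print("peak infectious day:", peak_day)
```

In an end-to-end setup like the one the abstract describes, the mismatch between the simulated and observed infections would be differentiated with respect to the parameters of the model producing `beta`; that training loop is omitted here.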


Uncertainty-aware segmentation for rainfall prediction post processing

Monaco, Simone, Monaco, Luca, Apiletti, Daniele

arXiv.org Artificial Intelligence

Accurate precipitation forecasts are crucial for applications such as flood management, agricultural planning, water resource allocation, and weather warnings. Despite advances in numerical weather prediction (NWP) models, they still exhibit significant biases and uncertainties, especially at high spatial and temporal resolutions. To address these limitations, we explore uncertainty-aware deep learning models for post-processing daily cumulative quantitative precipitation forecasts to obtain forecast uncertainties that lead to a better trade-off between accuracy and reliability. Our study compares different state-of-the-art models, and we propose a variant of the well-known SDE-Net, called SDE U-Net, tailored to segmentation problems like ours. We evaluate its performance for both typical and intense precipitation events. Our results show that all deep learning models significantly outperform the average baseline NWP solution, with our implementation of the SDE U-Net showing the best trade-off between accuracy and reliability. Integrating these models, which account for uncertainty, into operational forecasting systems can improve decision-making and preparedness for weather-related events.


Speech Analysis of Language Varieties in Italy

La Quatra, Moreno, Koudounas, Alkis, Baralis, Elena, Siniscalchi, Sabato Marco

arXiv.org Artificial Intelligence

Italy exhibits rich linguistic diversity across its territory due to the distinct regional languages spoken in different areas. Recent advances in self-supervised learning provide new opportunities to analyze Italy's linguistic varieties using speech data alone. This includes the potential to leverage representations learned from large amounts of data to better examine nuances between closely related linguistic varieties. In this study, we focus on automatically identifying the geographic region of origin of speech samples drawn from Italy's diverse language varieties. We leverage self-supervised learning models to tackle this task and analyze differences and similarities between Italy's regional languages. In doing so, we also seek to uncover new insights into the relationships among these diverse yet closely related varieties, which may help linguists understand their interconnected evolution and regional development over time and space. To improve the discriminative ability of learned representations, we evaluate several supervised contrastive learning objectives, both as pre-training steps and as additional fine-tuning objectives. Experimental evidence shows that pre-trained self-supervised models can effectively identify regions from speech recordings. Additionally, incorporating contrastive objectives during fine-tuning improves classification accuracy and yields embeddings that distinctly separate regional varieties, demonstrating the value of combining self-supervised pre-training and contrastive learning for this task.


Towards objective and interpretable speech disorder assessment: a comparative analysis of CNN and transformer-based models

Maisonneuve, Malo, Fredouille, Corinne, Lalain, Muriel, Ghio, Alain, Woisard, Virginie

arXiv.org Artificial Intelligence

Head and Neck Cancers (HNC) significantly impact patients' ability to speak, affecting their quality of life. Commonly used metrics for assessing pathological speech are subjective, prompting the need for automated and unbiased evaluation methods. This study proposes a self-supervised Wav2Vec2-based model for phone classification with HNC patients, to enhance accuracy and improve the discrimination of phonetic features for subsequent interpretability purpose. The impact of pre-training datasets, model size, and fine-tuning datasets and parameters are explored. Evaluation on diverse corpora reveals the effectiveness of the Wav2Vec2 architecture, outperforming ...

Some research has been focused on using these models to automatically assess the speech severity level [13, 14, 15]. Other studies analysed how well diseases can be predicted by these models. For instance, A. Favaro et al. [16] compared interpretable speech features to embeddings produced by SSL models on predicting the presence of Parkinson's disease. They showed that using embeddings provides better detection accuracies at the cost of losing the insight into speech and language deterioration given by interpretable features. While being able to detect a disease and assess its severity is important, we believe it is as important to interpret the output of these models, in order to enhance trust that clinicians can have in these systems.


Segmentation of diagnostic tissue compartments on whole slide images with renal thrombotic microangiopathies (TMAs)

Vo, Huy Q., Cicalese, Pietro A., Seshan, Surya, Rizvi, Syed A., Vathul, Aneesh, Bueno, Gloria, Dorado, Anibal Pedraza, Grabe, Niels, Stolle, Katharina, Pesce, Francesco, Roelofs, Joris J. T. H., Kers, Jesper, Bevilacqua, Vitoantonio, Altini, Nicola, Schröppel, Bernd, Roccatello, Dario, Barreca, Antonella, Sciascia, Savino, Mohan, Chandra, Nguyen, Hien V., Becker, Jan U.

arXiv.org Artificial Intelligence

The thrombotic microangiopathies (TMAs) manifest in renal biopsy histology with a broad spectrum of acute and chronic findings. Precise diagnostic criteria for a renal biopsy diagnosis of TMA are missing. As a first step towards a machine learning- and computer vision-based analysis of whole slide images from renal biopsies, we trained a segmentation model for the decisive diagnostic kidney tissue compartments (artery, arteriole, and glomerulus) on a set of whole slide images from renal biopsies with TMAs and Mimickers (distinct diseases with a nephropathological appearance similar to TMA, such as severe benign nephrosclerosis, various vasculitides, Bevacizumab-plug glomerulopathy, and arteriolar light chain deposition disease). Our segmentation model combines a U-Net-based tissue detection with a Shifted windows-transformer architecture to reach excellent segmentation results for even the most severely altered glomeruli, arterioles and arteries, even on unseen staining domains from a different nephropathology lab. With accurate automatic segmentation of the decisive renal biopsy compartments in human renal vasculopathies, we have laid the foundation for large-scale compartment-specific machine learning and computer vision analysis of renal biopsy repositories with TMAs.